Goto

Collaborating Authors

 leader agent


Reaching Resilient Leader-Follower Consensus in Time-Varying Networks via Multi-Hop Relays

arXiv.org Artificial Intelligence

We study resilient leader-follower consensus of multi-agent systems (MASs) in the presence of adversarial agents, where agents' communication is modeled by time-varying topologies. The objective is to develop distributed algorithms for the nonfaulty/normal followers to track an arbitrary reference value propagated by a set of leaders while they are in interaction with the unknown adversarial agents. Our approaches are based on the weighted mean subsequence reduced (W-MSR) algorithms with agents being capable to communicate with multi-hop neighbors. Our algorithms can handle agents possessing first-order and second-order dynamics. Moreover, we characterize necessary and sufficient graph conditions for our algorithms to succeed by the novel notion of jointly robust following graphs. Our graph condition is tighter than the sufficient conditions in the literature when agents use only one-hop communication (without relays). Using multi-hop relays, we can enhance robustness of leader-follower networks without increasing communication links and obtain further relaxed graph requirements for our algorithms to succeed. Numerical examples are given to verify the efficacy of our algorithms.


360$^\circ$REA: Towards A Reusable Experience Accumulation with 360{\deg} Assessment for Multi-Agent System

arXiv.org Artificial Intelligence

Large language model agents have demonstrated remarkable advancements across various complex tasks. Recent works focus on optimizing the agent team or employing self-reflection to iteratively solve complex tasks. Since these agents are all based on the same LLM, only conducting self-evaluation or removing underperforming agents does not substantively enhance the capability of the agents. We argue that a comprehensive evaluation and accumulating experience from evaluation feedback is an effective approach to improving system performance. In this paper, we propose Reusable Experience Accumulation with 360$^\circ$ Assessment (360$^\circ$REA), a hierarchical multi-agent framework inspired by corporate organizational practices. The framework employs a novel 360$^\circ$ performance assessment method for multi-perspective performance evaluation with fine-grained assessment. To enhance the capability of agents in addressing complex tasks, we introduce dual-level experience pool for agents to accumulate experience through fine-grained assessment. Extensive experiments on complex task datasets demonstrate the effectiveness of 360$^\circ$REA.


Collision-free Source Seeking and Flocking Control of Multi-agents with Connectivity Preservation

arXiv.org Artificial Intelligence

We present a distributed source-seeking and flocking control method for networked multi-agent systems with non-holonomic constraints. Based solely on identical on-board sensor systems, which measure the source local field, the group objective is attained by appointing a leader agent to seek the source while the remaining follower agents form a cohesive flocking with their neighbors using a distributed flocking control law in a connectivity-preserved undirected network. To guarantee the safe separation and group motion for all agents and to solve the conflicts with the "cohesion" flocking rule of Reynolds, the distributed control algorithm is solved individually through quadratic-programming optimization problem with constraints, which guarantees the inter-agent collision avoidance and connectivity preservation. Stability analysis of the closed-loop system is presented and the efficacy of the methods is shown in simulation results.


Distributed Optimal Formation Control for an Uncertain Multiagent System in the Plane

arXiv.org Artificial Intelligence

In this paper, we present a distributed optimal multiagent control scheme for quadrotor formation tracking under localization errors. Our control architecture is based on a leader-follower approach, where a single leader quadrotor tracks a desired trajectory while the followers maintain their relative positions in a triangular formation. We begin by modeling the quadrotors as particles in the YZ-plane evolving under dynamics with uncertain state information. Next, by formulating the formation tracking task as an optimization problem -- with a constraint-augmented Lagrangian subject to dynamic constraints -- we solve for the control law that leads to an optimal solution in the control and trajectory error cost-minimizing sense. Results from numerical simulations show that for the planar quadrotor model considered -- with uncertainty in sensor measurements modeled as Gaussian noise -- the resulting optimal control is able to drive each agent to achieve the desired global objective: leader trajectory tracking with formation maintenance. Finally, we evaluate the performance of the control law using the tracking and formation errors of the multiagent system.


Learning Altruistic Behaviours in Reinforcement Learning without External Rewards

arXiv.org Artificial Intelligence

Can artificial agents learn to assist others in achieving their goals without knowing what those goals are? Generic reinforcement learning agents could be trained to behave altruistically towards others by rewarding them for altruistic behaviour, i.e., rewarding them for benefiting other agents in a given situation. Such an approach assumes that other agents' goals are known so that the altruistic agent can cooperate in achieving those goals. However, explicit knowledge of other agents' goals is often difficult to acquire. Even assuming such knowledge to be given, training of altruistic agents would require manually-tuned external rewards for each new environment. Thus, it is beneficial to develop agents that do not depend on external supervision and can learn altruistic behaviour in a task-agnostic manner. Assuming that other agents rationally pursue their goals, we hypothesize that giving them more choices will allow them to pursue those goals better. Some concrete examples include opening a door for others or safeguarding them to pursue their objectives without interference. We formalize this concept and propose an altruistic agent that learns to increase the choices another agent has by maximizing the number of states that the other agent can reach in its future. We evaluate our approach on three different multi-agent environments where another agent's success depends on the altruistic agent's behaviour. Finally, we show that our unsupervised agents can perform comparably to agents explicitly trained to work cooperatively. In some cases, our agents can even outperform the supervised ones.


Using Reinforcement Learning to Herd a Robotic Swarm to a Target Distribution

arXiv.org Artificial Intelligence

In this paper, we present a reinforcement learning approach to designing a control policy for a "leader'' agent that herds a swarm of "follower'' agents, via repulsive interactions, as quickly as possible to a target probability distribution over a strongly connected graph. The leader control policy is a function of the swarm distribution, which evolves over time according to a mean-field model in the form of an ordinary difference equation. The dependence of the policy on agent populations at each graph vertex, rather than on individual agent activity, simplifies the observations required by the leader and enables the control strategy to scale with the number of agents. Two Temporal-Difference learning algorithms, SARSA and Q-Learning, are used to generate the leader control policy based on the follower agent distribution and the leader's location on the graph. A simulation environment corresponding to a grid graph with 4 vertices was used to train and validate the control policies for follower agent populations ranging from 10 to 100. Finally, the control policies trained on 100 simulated agents were used to successfully redistribute a physical swarm of 10 small robots to a target distribution among 4 spatial regions.